SEQUENCE LISTING

(1) GENERAL INFORMATION: 



(i) APPLICANT:

Ullrich, Axel
Aoki, Naohito
Kim, Yeong Woong
Wang, Hong Yang
Chen, Zhengjun
Naylor, Oliver
Kharitonenkov, Alexei Igorevich



(ii) TITLE OF INVENTION: NOVEL PTP20, PCP-2, BDP1, CLK,
AND SIRP POLYPEPTIDES AND RELATED
PRODUCTS AND METHODS



(iii) NUMBER OF SEQUENCES: 



(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Lyon & Lyon

(B) STREET: 633 West Fifth Street, Suite 4700

(C) CITY: Los Angeles

(D) STATE: California

(E) COUNTRY: U.S.A.

(F) ZIP: 90071-2066



(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: IBM P.C. DOS 5.0

(D) SOFTWARE: FastSEQ for Windows 2.0



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/877,150

(B) FILING DATE: June 17, 1997

(C) CLASSIFICATION:




(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: U.S. 60/019,629

(B) FILING DATE: June 17, 1996

(A) APPLICATION NUMBER: U.S. 60/023,485

(B) FILING DATE: August 9, 1996

(A) APPLICATION NUMBER: U.S. 60/030,860

(B) FILING DATE: November 13, 1996

(A) APPLICATION NUMBER: U.S. 60/034,286

(B) FILING DATE: December 19, 1996

(A) APPLICATION NUMBER: U.S. 60/030,964

(B) FILING DATE: November 15, 1996



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Warburg, Richard J.

(B) REGISTRATION NUMBER: 32,327

(C) REFERENCE/DOCKET NUMBER: 225/298



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (213) 489-1600 

(B) TELEFAX: (213) 955-0440 

(C) TELEX: 67-3510



(2) INFORMATION FOR SEQ ID NO: 1: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(ix) FEATURE:

(D) OTHER INFORMATION: "Xaa" in positions 3 and 5 stands
for an unspecified amino acid.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
Phe Trp Xaa Met Xaa Trp



(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(ix) FEATURE:

(D) OTHER INFORMATION: "Xaa" in position 6 stands for
either Ser, Ile or Val.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
His Cys Ser Ala Gly Xaa Gly



(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Phe Leu Glu Arg Leu Glu 



(2) INFORMATION FOR SEQ ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(ix) FEATURE:

(D) OTHER INFORMATION: "Xaa" in positions 3 and 5 stands
for an unspecified amino acid.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Arg Trp Xaa Met Xaa Trp



(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(ix) FEATURE:

(D) OTHER INFORMATION: "Xaa" in position 6 stands for
either Ser, Ile or Val.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

His Cys Ser Ala Gly Xaa Gly 



(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CTCTGTGTCC ACAGCAGTGC TGGCTGT 



(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
His Arg Asp Leu Ala Ala Arg 



(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(ix) FEATURE:

(D) OTHER INFORMATION: "Xaa" in position 2 stands for
Val or Met. "Xaa" in position
5 stands for Tyr or Phe.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Asp Xaa Trp Ser Xaa Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 9: 
(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGGGATCCCT TCGCCTTGCA GCTTTGTC 

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10 
CGGAATTCCT AGACTGATAC AGTCTGTAAG 

(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11 

Asp Leu Lys Pro Glu Asn 
1 5 

(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Ala Met Met Glu Arg Ile
1 5 



(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13 

TATAGCGGCC GCTAGACTGA TACAGTCTGT 



(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14 
TCCCCCGGGA TGCCCCATCC CCGAAGGTAC CA 



(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 
TATAGCGGCC GCTCACCGAC TGATATCCCG ACTGGAGTC 



(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TCCCCCGGGG AGACGATGCA TCACTGTAAG 



(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TATAGCGGCC GCGCTGGCCT GCACCTGTCA TCTGCTGGG 



(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CGGAATTCAT GCGGCATTCC AAACGAACTC 



(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TATAGCGGCC GCCCTGACTC CCACTCATTT CCTTTTTAA 



(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
CGGAATTCCG CCACCATGGC CCCTATACTA GGTTAT 



(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 21 
GCCAAGCTTG CCACCATGGC CCCTATACTA GGTTAT 



(2) INFORMATION FOR SEQ ID NO: 22: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
GTAGCAGTAA GAATAGTTAA A 



(2) INFORMATION FOR SEQ ID NO: 23: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GTTGCCCTGA GGATCATTAA GAAT



(2) INFORMATION FOR SEQ ID NO: 24: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GTTGCCCTGA GGATCATCCG GAAT 

(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TACAATTCTC ACTGCTACAT GTAAGCCATC 



(2) INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Pro Ile Tyr Ser Phe Ile Gly Gly Glu His Phe Pro Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 27: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Ile Val Glu Pro Asp Thr Glu Ile Lys

(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
Tyr Gly Phe Ser Pro Arg 



(2) INFORMATION FOR SEQ ID NO: 29: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
Ile Lys Glu Val Ala His Val Asn Leu Glu Val Arg



(2) INFORMATION FOR SEQ ID NO: 30: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
Val Ala Ala Gly Asp Ser Ala Thr 



(2) INFORMATION FOR SEQ ID NO: 31: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2226 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GAATTCCGGC ACGAGGCGGG TTGCAGTATG AGTCGCCAAT CGGACCTAGT GAGGAGCTTC 60

TTGGAGCAGC AGGAGGCCCG GGACCACCGG AAGGGGGCAA TCCTCGCCCG TGAGTTCAGC 120

GACATTAAGG CCCGCTCAGT GGCTTGGAAG ACTGAAGGTG TGTGCTCCAC TAAAGCCGGC 180

AGTCAGCAGG GAAACTCAAA GAAGAACCGC TACAAAGACG TGGTACCGTA TGATGAGACG 240

AGAGTCATCC TTTCCCTGCT CCAGGAGGAA GGACACGGAG ATTACATTAA TGCCAACTTC 300

ATCCGGGGCA CAGATGGAAG CCAGGCCTAC ATTGCGACGC AAGGACCCCT GCCTCACACT 360

CTGTTGGACT TCTGGCGCCT GGTTTGGGAG TTTGGAATCA AGGTGATCTT GATGGCCTGT 420

CAGGAGACAG AAAATGGACG GAGGAAGTGT GAACGCTACT GGGCCCAGGA GCGGGAGCCT 480

CTACAGGCCG GGCCTTTCTG CATCACCCTG ACAAAGGAGA CAGCACTGAC TTCGGACATC 540

ACTCTCAGGA CCCTCCAGGT TACATTCCAG AAGGAATCCC GTCCTGTGCA CCAGCTACAG 600

TACATGTCTT GGCCGGACCA CGGGGTTCCC AGCAGTTCCG ATCACATTCT CACCATGGTG 660

GAGGAGGCCC GTTGCCTCCA AGGACTTGGA CCTGGACCCC TCTGTGTCCA CTGCAGTGCT 720

GGCTGTGGAC GAACAGGTGT CTTGTGTGCT GTTGATTACG TGAGGCAGTT GCTTCTGACT 780

CAGACAATCC CACCCAATTT CAGCCTCTTT GAAGTGGTCC TGGAGATGCG GAAACAGCGA 840

CCTGCAGCGG TGCAGACAGA GGAGCAGTAC AGGTTCCTGT ACCACACAGT GGCTCAGCTA 900

TTCTCCCGCA CTCTCCAGAA CAACAGTCCC CTCTACCAGA ACCTCAAGGA GAACCGCGCT 960

CCAATCTGCA AGGACTCCTC GTCCCTCAGG ACCTCCTCAG CCCTGCCTGC CACATCCCGC 1020

CCACTGGGTG GCGTTCTCAG GAGCATCTCG GTGCCTGGGC CACCGACCCT TCCCATGGCT 1080

GACACTTACG CTGTGGTGCA GAAGCGTGGC GCTTCCGGCA GCACAGGGCC GGGCACGCGG 1140

GCGCCCAACA GCACGGACAC CCCGATCTAC AGCCAGGTGG CTCCACGTAT CCAGCGGCCC 1200

GTGTCACACA CCGAAAACGC GCAGGGGACA ACGGCACTGG GCCGAGTTCC TGCGGATGAA 1260

AACCCTTCCG GGCCTGATGC CTATGAGGAA GTAACAGATG GAGCGCAGAC TGGTGGGCTA 1320

GGCTTCAACT TGCGCATTGG AAGACCTAAA GGGCCACGGG ATCCTCCAGC GGAGTGGACA 1380

CGGGTGTAAT GAGTGCTGTA CCAGTTCCAG CCTGTCACTC AGTGGTGGCT GGGCGACTGC 1440

AACCCCCATG CTGCTGTGTG CTGTCTTATG TATGAGTGGG ACTCATGGGC CTGAATCAAA 1500

ATAAAAGTTT CTCAGGGTAG AAAAAAACAA ATAGGGACTT TGGCCAGTGG TTATAGCAGT 1560

CAAAGCCAGG GGCTAGGAGG GGTAAGTGGG GGAGGTGGTG GATCTACTCT GAGAAAGTTT 1620

AGGAAAGCAC ATCAAGAGTG AGCATCGCCA CTCTTCTCCC CATACACCTA CTGGAAAGTG 1680

CACCCCAGAC AGAGTCCTAA CTTGACAGTG CACCTCAGAC AGGTCGCTAC CTGGATGGAC 1740

ATGCTGGCCC TACAGCTAGA GACATGTCTA ATTAGATCCT CATGTAAACT TGCAATGAGC 1800

TAGAAAGATC TCCGTCTGGT CAGGGAAATG GATCACCTAG TCAGGTAAAT AGTGTGCCAT 1860

CCAGAAGACA GAACTGCAAG ATACCGTCTT TCTCAAAATG GAAGAAAATA GATCCTCAAG 1920

AATAAATGTA TGTACAATGC TCTACGCCCT GATCCTGCCC TGCCTCACTG CCATAATGTC 1980

ACAAACAAGT CAGGGTCTAT ATGACAGTTG TTCATCTAGT CAGTCCTGAC TGTGGCCTCT 2040

GCAGGCTCAG ATAGTGCCTT CTGCAGACTC TTGGAATGCC CGTCTTGAAC TTGATGAAAG 2100

CTTCTACCGG GAACTTGTAA ACATCATTAA AATTATTAAT GTAGAATTCA ATAAAGAGTG 2160

GGTCAAAAAC TCAAAAAAAA AAAAAAAAAA AAAAAAAAAC TCGAGAGTAC TTCTAGAGCG 2220

GGCGGG 2226



(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Ser Arg Gln Ser Asp Leu Val Arg Ser Phe Leu Glu Gln Gln Glu
1 5 10 15

Ala Arg Asp His Arg Lys Gly Ala Ile Leu Ala Arg Glu Phe Ser Asp
20 25 30

Ile Lys Ala Arg Ser Val Ala Trp Lys Thr Glu Gly Val Cys Ser Thr
35 40 45

Lys Ala Gly Ser Gln Gln Gly Asn Ser Lys Lys Asn Arg Tyr Lys Asp
50 55 60

Val Val Pro Tyr Asp Glu Thr Arg Val Ile Leu Ser Leu Leu Gln Glu
65 70 75 80

Glu Gly His Gly Asp Tyr Ile Asn Ala Asn Phe Ile Arg Gly Thr Asp
85 90 95

Gly Ser Gln Ala Tyr Ile Ala Thr Gln Gly Pro Leu Pro His Thr Leu
100 105 110



Leu Asp Phe Trp Arg Leu Val Trp Glu Phe Gly Ile Lys Val Ile Leu
115 120 125

Met Ala Cys Gln Glu Thr Glu Asn Gly Arg Arg Lys Cys Glu Arg Tyr
130 135 140

Trp Ala Gln Glu Arg Glu Pro Leu Gln Ala Gly Pro Phe Cys Ile Thr
145 150 155 160

Leu Thr Lys Glu Thr Ala Leu Thr Ser Asp Ile Thr Leu Arg Thr Leu
165 170 175

Gln Val Thr Phe Gln Lys Glu Ser Arg Pro Val His Gln Leu Gln Tyr
180 185 190

Met Ser Trp Pro Asp His Gly Val Pro Ser Ser Ser Asp His Ile Leu
195 200 205

Thr Met Val Glu Glu Ala Arg Cys Leu Gln Gly Leu Gly Pro Gly Pro
210 215 220

Leu Cys Val His Cys Ser Ala Gly Cys Gly Arg Thr Gly Val Leu Cys
225 230 235 240

Ala Val Asp Tyr Val Arg Gln Leu Leu Leu Thr Gln Thr Ile Pro Pro
245 250 255

Asn Phe Ser Leu Phe Glu Val Val Leu Glu Met Arg Lys Gln Arg Pro
260 265 270

Ala Ala Val Gln Thr Glu Glu Gln Tyr Arg Phe Leu Tyr His Thr Val
275 280 285

Ala Gln Leu Phe Ser Arg Thr Leu Gln Asn Asn Ser Pro Leu Tyr Gln
290 295 300

Asn Leu Lys Glu Asn Arg Ala Pro Ile Cys Lys Asp Ser Ser Ser Leu
305 310 315 320

Arg Thr Ser Ser Ala Leu Pro Ala Thr Ser Arg Pro Leu Gly Gly Val
325 330 335

Leu Arg Ser Ile Ser Val Pro Gly Pro Pro Thr Leu Pro Met Ala Asp
340 345 350

Thr Tyr Ala Val Val Gln Lys Arg Gly Ala Ser Gly Ser Thr Gly Pro
355 360 365

Gly Thr Arg Ala Pro Asn Ser Thr Asp Thr Pro Ile Tyr Ser Gln Val
370 375 380

Ala Pro Arg Ile Gln Arg Pro Val Ser His Thr Glu Asn Ala Gln Gly
385 390 395 400

Thr Thr Ala Leu Gly Arg Val Pro Ala Asp Glu Asn Pro Ser Gly Pro
405 410 415

Asp Ala Tyr Glu Glu Val Thr Asp Gly Ala Gln Thr Gly Gly Leu Gly
420 425 430

Phe Asn Leu Arg Ile Gly Arg Pro Lys Gly Pro Arg Asp Pro Pro Ala
435 440 445

Glu Trp Thr Arg Val
450



(2) INFORMATION FOR SEQ ID NO: 33: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5581 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

AATTCCGGGC GCCAGTCCCG CTCCGCGCCG CGCCGCTCCG CTCCGGCTCG GGCTCCGGCT 60

CGCCTCGGGC TGGGCTCGGG CTCCGGGGGC GGCGTCCCCG CGCCGGGCCC CGGGACGCGC 120

CGACCTCCAA CCATGGCCCG TGCCCAGGCG CTCGTGCTGG CACTCACCTT CCAGCTCTGC 180

GCGCCGGAGA CCGAGACTCC GGCAGCTGGC TGCACCTTCG AGGAGGCAAG TGACCCAGCA 240

GTGCCCTGCG AGTACAGCCA GGCCCAGTAC GATGACTTCC AGTGGGAGCA AGTGCGAATC 300

CACCCTGGCA CCCGGGCACC TGCGGACCTG CCCCACGGCT CCTACTTGAT GGTCAACACT 360

TCCCAGCATG CCCCAGGCCA GCGAGCCCAT GTCATCTTCC AGAGCCTGAG CGAGAATGAT 420

ACCCACTGTG TGCAGTTCAG CTACTTCCTG TACAGCCGGG ACGGCACAGG CGGCACCCTG 480

CGCGTCTACG TGCGCGTTAA TGGGGGCCCC CTGGCGAGTG CTGTGTGGAA TATGACTGGA 540

TCCCACGGCC GTCAGTGGCA CCAGGCTGAG CTGGCTGTCA GCACTTTCTG GCCCAATGAA 600

TATCAGGTGC TGTTTGAGGC CCTCATCTCC CCAGACCGCA GGGGCTACAT GGGCCTAGAT 660

GACATCCTGC TTCTCAGCTA CCCCTGCGCA AAGGCCCCAC ACTTCTCCCG CCTGGGCGAC 720

GTGGAGGTCA ACGCGGGCCA GAACGCGTCG TTCCAGTGCA TGGCCGCGGG AGAGCCCATG 780

CGCCAACGCT TCCTCTTGCA ACGGCAGAGC GGGGCCCTGG TGCCGGCCGG GGCGTTCGGC 840

ACATCAGCCA CCGGCTTCCT GGCCACTTTC CCGCTGGCTG CCGTGAGCCG CGCCGAGCAG 900

GACCTGTACC GCTGTGTGTC CCAGGCCCCG CGCGGCGGCG TCTCTAACTT CCCGGAGCTC 960

ATCGTCAAGG AGCCCCCAAC TCCCATCGCG CCCCCACAGC TGCTGCGTGC TGGCCCCACC 1020

TACCTCATCA TCCAGCTCAA CACCAACTCC ATCATTGGCG ACGGGCCGAT CGTGCGCAAG 1080

GAGATTGAGT ACCGCATGGC GCGCGGGCCC TGGGCTGAGG TGCACGCCGT CAGCCTGCAG 1140

ACCTACAAGC TGTGGCACCT CGACCCCGAC ACAGACTATG AGATCAGCGT GCTGCTCACG 1200

CGTCCCGGAG ACGGCGGCAC TGGCCGCTGG GCCACCCCTC ATCAGCCGCA CCAAATGCGC 1260

AGAGCCCATG AGGGCCCCAA AGGCCTGGCT TTTGCTGAGA TCCAGGCCCG TCAGCTGACC 1320

CTGCAGTGGG AACCACTGGG CTACAACGTG ACGCGTTGCC ACACCTATAC TGTGTCGCTG 1380

TGCTATCACT ACACCCTGGG CAGCAGCCAC AACCAGACCA TCCGAGAGTG TGTGAAGACA 1440

GAGCAAGGTG TCAGCCGCTA CACCATCAAG AACCTGCTGC CCTATCGGAA CGTTCACGTG 1500

AGGCTTGTCC TCACTAACCC TGAGGGGCGC AAAGAGGGCA AGGAGGTCAC TTTCCAGACG 1560

GATGAGGATG TGCCCAGTGG GATTGCAGCC GAGTCCCTGA CCTTCACTCC ACTGGAGGAC 1620

ATGATCTTCC TCAAGTGGGA GGAGCCCCAG GAGCCCAATG GTCTCATCAC CCAGTATGAG 1680

ATCAGCTACC AGAGCATCGA GTCATCAGAC CCGGCAGTGA ACGTGCCAGG CCCACGACGT 1740

ACCATCTCCA AGCTCCGCAA TGAGACCTAC CATGTCTTCT CCAACCTGCA CCCAGGCACC 1800

ACCTACCTGT TCTCCGTGCG GGCCCGCACA GGCAAAGGCT TCGGCCAGGC GGCACTCACT 1860

GAGATAACCA CTAACATCTC TGCTCCCAGC TTTGATTATG CCGACATGCC GTCACCCCTG 1920

GGCGAGTCTG AGAACACCAT CACCGTGCTG CTGAGGCCGG CACAGGGCCG CGGTGCGCCC 1980

ATCAGTGTGT ACCAGGTGAT TGTGGAGGAG GAGCGGGCGC GAGGCTGCGG CGGGACGAGG 2040

TGGACAGGAC TGCTTCCCAG TGCCATTGAC CTTCGAGGCG GCGCTGGCCC CAGGCTGGTG 2100

CACTACTTCG GGGCCGAACT GGCGGCCAGC AGTCTACCTG AGGCCATGCC CTTTACCGTG 2160

GGTGACAACC AGACCTACCG AGGCTTCTGG AACCCACCAC TTGAGCCTAG GAAGGCCTAT 2220

CTCATCTACT TCCAGGCAGC AAGCCACCTG AAGGGGGAGA CCCGGCTGAA TTGCATCCGC 2280

ATTGCCAGGA AAGCTGCCTG CAAGGAAAGC AAGCGGCCCC TGGAGGTGTC CCAGAGATCG 2340

GAGGAGATGG GGCTTATCCT GGGCATCTGT GCAGGGGGGC TTGCTGTCCT CATCCTTCTC 2400

CTGGGTGCCA TCATTGTCAT CATCCGCAAA GGGAAGCCGG TGAACATGAC CAAGGCCACC 2460

GTCAACTACC GCCAGGAGAA GACACACATG ATCAGCGCCG TGGACCGCAG CTTCACAGAC 2520

CAGAGCACCC TGCAGGAGGA CGAGCGGCTG GGCCTGTCCT TCATGGACAC CCATGGCTAC 2580

AGCACCCGGG GAGACCAGCG CAGCGGTGGG GTCACTGAGG CCAGCAGCCT CCTGGGGGGC 2640

TCCCCGAGGC GTCCCTGTGG CCGGAAGGGC TCCCCATACC ACACGGGGCA GCTGCACCCT 2700

GCGGTGCGTG TCGCAGACCT TCTGCAGCAC ATCAACCAGA TGAAGACGGC CGAGGGTTAC 2760

GGCTTCAAGC AGGAGTATGA GAGCTTCTTT GAAGGCTGGG ACGCCACAAA GAAGAAAGAC 2820

AAGGTCAAGG GCAGCCGGCA GGAGCCAATG CCTGCCTATG ATCGGCACCG AGTGAAACTG 2880

CACCCGATGC TGGGAGACCC CAATGCCGAC TACATTAATG CCAACTACAT AGATGGTTAC 2940

CACAGGTCAA ACCACTTCAT AGCCACTCAA GGGCCGAAGC CTGAGATGGT CTATGACTTC 3000

TGGCGTATGG TGTGGCAGGA GCACTGTTCC AGCATCGTCA TGATCACCAA GCTGGTCGAG 3060

GTGGGCAGGG TGAAATGCTC ACGGTACTGG CCGGAGGACT CAGACACCTA CGGGGACATC 3120

AAGATTATGC TGGTGAAGAC AGAGACCCTG GCTGAGTATG TCGTGCGCAC TTTTGCCCTG 3180

GAGCGGAGAG GCTACTCTGC CCGGCACGAG GTCCGCCAGT CCCACTTCAC AGCGTGGCCA 3240

GAGCATGGCG TCCCCTACCA TGCCACGGGG CTGCTGGCTT TCATCCGGCG GGTGAAGGCC 3300

TCCACCCCAC CTGATGCCGG GCCCATTGTC ATCCACTGCA GCGCGGGCAC CGGCCGCACA 3360

CGTTGCTATA TCGTCCTGGA TGTGATGCTG GACATGGCAG AGTGTGAGGG CGTCGTGGAC 3420

ATTTACAACT GTGTGAAGAC TCTCTGCTCC CGGCGTGTCA ACATGATCCA GACTGAGGAG 3480

CAGTACATCT TCATTCATGA TGCAATCCTG GAGGCCTGCC TGTGTGGGGA GACCACCATC 3540

CCTGTCAGTG AGTTCAAGGC CACCTACAAG GAGATGATCC GCATTGATCC TCAGAGTAAT 3600

TCCTCCCAGC TGCGGGAAGA GTTCCAGACG CTGAACTCGG TCACCCCGCC GCTGGACGTG 3660

GAGGAGTGCA GCATCGCCCT GTTGCCCCGG AACCGCGACA AGAACCGCAG CATGGACGTC 3720

CTGCCGCCCG ACCGCTGCCT GCCCTTCCTC ATCTCCACTG ATGGGGACTC CAACAACTAC 3780

ATTAATGCAG CCCTGACTGA CAGCTACACA CGGAGGTCGG CCTTCATGGT GACCCTGCAC 3840

CCGCTGCAGA GCACCACGCC CGACTTCTGG CGGCTGGTCT ACGATTACGG GTGCACCTCC 3900

ATCGTCATGC TCAACCAGCT GAACCAGTCC AACTCCGCCT GGCCCTGCCT GCAGTACTGG 3960

CCAGAGCCAG GCCGGCAGCA ATATGGCCTC ATGGAGGTGG AGTTTATGTC GGGCACAGCT 4020

GATGAAGACT TAGTGGCTCG AGTCTTCCGG GTGCAGAACA TCTCTCGGTT GCAGGAGGGA 4080

GACCTGCTGG TGCGGCACTT CCAGTTCCTG CGCTGGTCTG CATACCGGGA CACACCTGAC 4140

TCCAAGAAGG CCTTCTTGCA CCTGCTGGCT GAGGTGGACA AGTGGCAGGC CGAGAGTGGG 4200

GATGGGCGCA CCATCGTGCA CTGCCTAAAC GGGGGAGGAC GCAGCGGCAC CTTCTGCGCC 4260

TGCGCCACGG TCCTGGAGAT GATCCGCTGC CACAACTTGG TGGACGTTTT CTTTGCTGCC 4320

CAAACCCTCC GGAACTACAA ACCCAACATG GTGGAGACCA TGGATCAGTA CCACTTTTGC 4380

TACGATGTGG CCCTGGAGTA CTTGGAGGGG CTGGAGTCAA GATAGCGGGG CCCTGGCCTG 4440

GGGCACCCAC TGCACACTCA GGGCCAGACC CACCATCCTG GACTGGCGAG GAAGATCAGT 4500

GCCTCCTGCT CTGCCCAAAC ACACTCCCAT GGGGCAAGCA CTGGAGTGGA TGCTGGGCTA 4560

TCTTGCTCCC CCTTCCACTG TGGGCAGGGC CTTTCGCTTG TCCCATGGGC GGGTGGTGGG 4620

CCAAGGAGGA GCTTAGCAAG TCTGCACCCC ACCCCCACCT CCATAGGGTC CTGCAGGCCT 4680

GTGCTGAGAG GCCTGGTGCT GCCTGGCAGA GTGACAAAGG CTCAGGACGG CTGGCTCTGG 4740

GGGACTCAGG CCAAGGGGGT TGGCAGGATC CTGGGTTTTG GGAGGGATGA GTGAGGCCCT 4800

GCAGAGAGCA TCCCAGGCCA AGGTTCCCAC TCAGCCTGCC CCCTCTGCAT GTGGGTAGAG 4860

GATGTACTGG GACTTGGCAT TTAGGATTCC ATCTGGGGGA CCCCCTGAAG GTCCCCCCCA 4920

AGCAGGTCTC AATTCTGATA GCCAGTGGGG CACACTGACT GTCCTCCCCA GGGGAACTGC 4980

AGCGCCCTCC TCCCCACTGC CCCCTCCAGC CCCTGAGATA TTTTGCTCAC TATCCCTCCC 5040

CACTTGCTTC CCTGATATGT GCTCTGACTT CCCTGAACCA GGATCTGCCT ATTACTGCTG 5100

TCCCATGGGG GGCTCCTTCC CTGCCTGACC CACTGTTGCA GAATGAAGTC ACCTCGCCCC 5160

CCTCTTCCTT TAATCTTCAG GCCTCACTGG CCTGTCCTGC TCAGCTTGGG CCAGTGACAA 5220

TCTGCAAGGC TGAACAACAG CCCCTGGGGT TGAGGCCCCT GTGGCTCCTG GTCAGGCTGC 5280

CCGTTGTGGG GAGGGGCAGT GTTAGAGCAG GGCTGGTCAT ACCCTCTGGA GTTCAGAGCA 5340

AGAGGTAGGA CCAGTGCTTT TTTGTTTCTT TTGTTATTTT TGGTTGGGTG GGTGGGAAGG 5400

TCTCTTTAAA ATGGGGCAGG CCACACCCCC ATTCCGTGCC TCAATTTCCC CATCTGTAAA 5460

CTGTAGATAT GACTACTGAC CTACCTCGCA GGGGGCTGTG GGGAGGCATA AGCTGATGTT 5520

TGTAAAGCGC TTTGTAAATA AACGTGCTCT CTGAATGCCA AAAAAAAAAA AACAAAAAAA 5580

A 5581



(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1430 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Met Ala Arg Ala Gln Ala Leu Val Leu Ala Leu Thr Phe Gln Leu Cys
1 5 10 15

Ala Pro Glu Thr Glu Thr Pro Ala Ala Gly Cys Thr Phe Glu Glu Ala
20 25 30

Ser Asp Pro Ala Val Pro Cys Glu Tyr Ser Gln Ala Gln Tyr Asp Asp
35 40 45

Phe Gln Trp Glu Gln Val Arg Ile His Pro Gly Thr Arg Ala Pro Ala
50 55 60

Asp Leu Pro His Gly Ser Tyr Leu Met Val Asn Thr Ser Gln His Ala
65 70 75 80

Pro Gly Gln Arg Ala His Val Ile Phe Gln Ser Leu Ser Glu Asn Asp
85 90 95

Thr His Cys Val Gln Phe Ser Tyr Phe Leu Tyr Ser Arg Asp Gly Thr
100 105 110

Gly Gly Thr Leu Arg Val Tyr Val Arg Val Asn Gly Gly Pro Leu Ala
115 120 125

Ser Ala Val Trp Asn Met Thr Gly Ser His Gly Arg Gln Trp His Gln
130 135 140

Ala Glu Leu Ala Val Ser Thr Phe Trp Pro Asn Glu Tyr Gln Val Leu
145 150 155 160

Phe Glu Ala Leu Ile Ser Pro Asp Arg Arg Gly Tyr Met Gly Leu Asp
165 170 175

Asp Ile Leu Leu Leu Ser Tyr Pro Cys Ala Lys Ala Pro His Phe Ser
180 185 190

Arg Leu Gly Asp Val Glu Val Asn Ala Gly Gln Asn Ala Ser Phe Gln
195 200 205

Cys Met Ala Ala Gly Glu Pro Met Arg Gln Arg Phe Leu Leu Gln Arg
210 215 220

Gln Ser Gly Ala Leu Val Pro Ala Gly Ala Phe Gly Thr Ser Ala Thr
225 230 235 240

Gly Phe Leu Ala Thr Phe Pro Leu Ala Ala Val Ser Arg Ala Glu Gln
245 250 255

Asp Leu Tyr Arg Cys Val Ser Gln Ala Pro Arg Gly Gly Val Ser Asn
260 265 270

Phe Pro Glu Leu Ile Val Lys Glu Pro Pro Thr Pro Ile Ala Pro Pro
275 280 285

Gln Leu Leu Arg Ala Gly Pro Thr Tyr Leu Ile Ile Gln Leu Asn Thr
290 295 300

Asn Ser Ile Ile Gly Asp Gly Pro Ile Val Arg Lys Glu Ile Glu Tyr
305 310 315 320

Arg Met Ala Arg Gly Pro Trp Ala Glu Val His Ala Val Ser Leu Gln
325 330 335

Thr Tyr Lys Leu Trp His Leu Asp Pro Asp Thr Asp Tyr Glu Ile Ser
340 345 350

Val Leu Leu Thr Arg Pro Gly Asp Gly Gly Thr Gly Arg Trp Ala Thr
355 360 365

Pro His Gln Pro His Gln Met Arg Arg Ala His Glu Gly Pro Lys Gly
370 375 380

Leu Ala Phe Ala Glu Ile Gln Ala Arg Gln Leu Thr Leu Gln Trp Glu
385 390 395 400

Pro Leu Gly Tyr Asn Val Thr Arg Cys His Thr Tyr Thr Val Ser Leu
405 410 415

Cys Tyr His Tyr Thr Leu Gly Ser Ser His Asn Gln Thr Ile Arg Glu
420 425 430

Cys Val Lys Thr Glu Gln Gly Val Ser Arg Tyr Thr Ile Lys Asn Leu
435 440 445

Leu Pro Tyr Arg Asn Val His Val Arg Leu Val Leu Thr Asn Pro Glu
450 455 460

Gly Arg Lys Glu Gly Lys Glu Val Thr Phe Gln Thr Asp Glu Asp Val
465 470 475 480

Pro Ser Gly Ile Ala Ala Glu Ser Leu Thr Phe Thr Pro Leu Glu Asp
485 490 495

Met Ile Phe Leu Lys Trp Glu Glu Pro Gln Glu Pro Asn Gly Leu Ile
500 505 510

Thr Gln Tyr Glu Ile Ser Tyr Gln Ser Ile Glu Ser Ser Asp Pro Ala
515 520 525

Val Asn Val Pro Gly Pro Arg Arg Thr Ile Ser Lys Leu Arg Asn Glu
530 535 540

Thr Tyr
545 

Ser Val 
Glu He 
Pro Ser 



His Val Phe Ser Asn Leu His Pro Gly Thr Thr 
550 55S 



Arg Ala Arg Thr Gly Lys Gly Phe Gly Gin Ala 
565 570 



Thr Thr Asn He Ser Ala Pro Ser Phe Asp Tyr 
580 585 



Pro Leu Gly Glu Ser Glu Asn Thr He 
595 600 



Tyr Leu Phe 
560 



Ala Asp Met 
590 



Leu Leu Arg 
. Val He Val 



Glu Glu ( 
625 

Leu Pro 

His Tyr 

Pro Phe 

Pro Leu 
690 

His Leu 
705 

Ala Ala 

Glu Glu 

Leu He 

Pro Val 
770 

His Met 
785 

Gin Glu 
Ser Thr 
Leu Leu 



Phe Gly Ala Glu Leu Ala Ala Ser Ser 
660 665 



Thr Val Gly Asp Asn Gin Thr Tyr Arg ( 
675 680 



i Glu Pro Arg Lys Ala Tyr Leu He Tyr i 
i 695 



Arg Trp 
Gly Pro 
Leu Pro 



Thr Gly Leu 
640 



Arg Leu Val 
655 



: Trp Asn Pro 
t Ala Ala Ser 



Lys Gly Glu Thr Arg Leu Asn Cys He 
710 715 



Met Gly Leu He Leu Gly He Cys Ala 
740 745 



Leu Leu Leu Gly Ala He He Val He 
755 760 



Asn Met Thr Lys Ala Thr Val Asn Tyr 
775 



Asp Glu Arg Leu Gly Leu Ser Phe Met 
805 810 



Arg Gly Asp Gin Arg Ser Gly Gly Val 
820 825 



Arg He i 

Val Ser < 

Gly Gly I 

He Arg 
765 

Arg Gin 
780 

Asp Gin 
Asp Thr 
Thr Glu 



Lys Gly Lys 
Glu Lys Thr 



Gly Gly Ser Pro Arg Arg Pro Cys Gly i 
835 840 



Thr Gly Gin Leu His Pro Ala Val Arg 1 
855 I 



He Asn Gin Met Lys Thr Ala Glu Gly ' 
870 875 



i Gly Ser Pro 
i Asp Leu Leu

Glu Tyr Glu Ser Phe Phe Glu Gly Trp Asp Ala Thr Lys Lys Lys Asp
885 890 895

Lys Val Lys Gly Ser Arg Gln Glu Pro Met Pro Ala Tyr Asp Arg His
900 905 910

Arg Val Lys Leu His Pro Met Leu Gly Asp Pro Asn Ala Asp Tyr Ile
915 920 925

Asn Ala Asn Tyr Ile Asp Gly Tyr His Arg Ser Asn His Phe Ile Ala
930 935 940

Thr Gln Gly Pro Lys Pro Glu Met Val Tyr Asp Phe Trp Arg Met Val
945 950 955 960

Trp Gln Glu His Cys Ser Ser Ile Val Met Ile Thr Lys Leu Val Glu
965 970 975



Thr Gly Arg Thr Arg Cys Tyr Ile Val Leu Asp Val Met Leu Asp Met
1075 1080 1085



Cys Ser Arg Arg Val Asn Met Ile Gln Thr Glu Glu Gln Tyr Ile Phe
1105 1110 1115 1120

Ile His Asp Ala Ile Leu Glu Ala Cys Leu Cys Gly Glu Thr Thr Ile
1125 1130 1135

Pro Val Ser Glu Phe Lys Ala Thr Tyr Lys Glu Met Ile Arg Ile Asp
1140 1145 1150



Ser Val Thr Pro Pro Leu Asp Val Glu Glu Cys Ser Ile Ala Leu Leu
1170 1175 1180

Pro Arg Asn Arg Asp Lys Asn Arg Ser Met Asp Val Leu Pro Pro Asp
1185 1190 1195 1200

Arg Cys Leu Pro Phe Leu Ile Ser Thr Asp Gly Asp Ser Asn Asn Tyr
1205 1210 1215

Ile Asn Ala Ala Leu Thr Asp Ser Tyr Thr Arg Arg Ser Ala Phe Met
1220 1225 1230

Val Thr Leu His Pro Leu Gln Ser Thr Thr Pro Asp Phe Trp Arg Leu
1235 1240 1245

Val Tyr Asp Tyr Gly Cys Thr Ser Ile Val Met Leu Asn Gln Leu Asn
1250 1255 1260

Gln Ser Asn Ser Ala Trp Pro Cys Leu Gln Tyr Trp Pro Glu Pro Gly
1265 1270 1275 1280

Arg Gln Gln Tyr Gly Leu Met Glu Val Glu Phe Met Ser Gly Thr Ala
1285 1290 1295

Asp Glu Asp Leu Val Ala Arg Val Phe Arg Val Gln Asn Ile Ser Arg
1300 1305 1310

Leu Gln Glu Gly Asp Leu Leu Val Arg His Phe Gln Phe Leu Arg Trp
1315 1320 1325

Ser Ala Tyr Arg Asp Thr Pro Asp Ser Lys Lys Ala Phe Leu His Leu
1330 1335 1340

Leu Ala Glu Val Asp Lys Trp Gln Ala Glu Ser Gly Asp Gly Arg Thr
1345 1350 1355 1360

Ile Val His Cys Leu Asn Gly Gly Gly Arg Ser Gly Thr Phe Cys Ala
1365 1370 1375

Cys Ala Thr Val Leu Glu Met Ile Arg Cys His Asn Leu Val Asp Val
1380 1385 1390

Phe Phe Ala Ala Gln Thr Leu Arg Asn Tyr Lys Pro Asn Met Val Glu
1395 1400 1405

Thr Met Asp Gln Tyr His Phe Cys Tyr Asp Val Ala Leu Glu Tyr Leu
1410 1415 1420

Glu Gly Leu Glu Ser Arg
1425 1430



(2) INFORMATION FOR SEQ ID NO: 35: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2810 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GAATTCGGCA CGAGCGGGCT GGACCTTGCT CGCCCGCGGC GCCATGAGCC GCAGC CTGG A 60 
CTCGGCGCGG AGCTTCCTGG AGCGGCTGGA AGCGCGGGGC GGCCGGGAGG GGGCAGTCCT 120 
CGCCGGCGAG TTCAGCGACA TCCAGGCCTG CTCGGCCGCC TGGAAGGCTG ACGGCGTGTG 18 0 
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CTCCACCGTG GCCGGCAGTC GGCCAGAGAA CGTGAGGAAG AACCGCTACA AAGACGTGCT 24 0 

GCCTTATGAT CAGACGCGAG TAATCCTCTC CCTGCTCCAG GAAGAGGGAC ACAGCGACTA 30 0 

CATTAATGGC AACTTCATCC GGGGCGTGGA TGGAAGCCTG GCCTACATTG CCACGCAAGG 3 60 

ACCCTTGCCT CACACCCTGC TAGACTTCTG GAGACTGGTC TGGGAGTTTG GGGTCAAGGT 42 0 

GATCCTGATG GCCTGTCGAG AGATAGAGAA TGGGCGGAAA AGGTGTGAGC GGTACTGGGC 480 

CCAGGAGCAG GAGCCACTGC AGACTGGGCT TTTCTGCATC ACTCTGATAA AGGAGAAGTG 54 0 

GCTGAATGAG GACATCATGC TCAGGAC CCT CAAGGTCACA TTCCAGAAGG AGTCCCGTTC 600 

TGTGTACCAG CTACAGTATA TGTCCTGGCC AGACCGTGGG GTCCCCAGCA GTCCTGACCA 660 

CATGCTCGCC ATGGTGGAGG AAGCCCGTCG CCTCCAGGGA TCTGGCCCTG AACCCCTCTG 72 0 

TGTCCACTGC AGTGCGGGTT GTGGGCGAAC AGGCGTCCTG TGCACCGTGG ATTATGTGAG 780 

GCAGCTGCTC CTGACCCAGA TGATCCCACC TGACTTCAGT CTCTTTGATG TGGTCCTTAA 84 0 

GATGAGGAAG CAGCGGCCTG CGGCCGTGCA GACAGAGGAG CAGTACAGGT TCCTGTACCA 900 

CACGGTGGCT CAGATGTTCT GCTCCACACT CCAGAATGCC AGCCCCCACT ACCAGAACAT 960 

CAAAGAGAAT TGTGCCCCAC TCTACGACGA TGCCCTCTTC CTCCGGACTC CCCAGGCACT 1020

TCTCGCCATA CCCCGCCCAC CAGGAGGGGT CCTCAGGAGC ATCTCTGTGC CCGGGTCCCC 1080

GGGCCACGCC ATGGCTGACA CCTACGCGGA GGAGCAGAAG CGCGGGGCTC CAGCGGGCGC 1140

CGGGAGTGGG ACGCAGACGG GGACGGGGAC GGGGGCGCGC AGCGCGGAGG AGGCGCCGCT 1200

CTACAGCAAG GTGACGCCGC GCGCCCAGCG ACCCGGGGCG CACGCGGAGG ACGCGAGGGG 1260

GACGCTGCCT GGCCGCGTTC CTGCTGACCA AAGTCCTGCC GGATCTGGCG CCTACGAGGA 1320

CGTGGCGGGT GGAGCTCAGA CCGGTGGGCT AGGTTTCAAC CTGCGCATTG GGAGGCCGAA 1380

GGGTCCCCGG GACCCGCCTG CTGAGTGGAC CCGGGTGTAA GTCTAACGCC AGTTCCTGCC 1440

TGTTGCCTCT TGTGAGCTCG GACTGCTGAT GCCCCGGTGC TGCTGAGCGC CGTGCCGAGA 1500

ATGGAAACAG TGGGCCTGGA TCAAAGTTAA AGTTTCTCAG GGTGGGAAAT GTGGGGGCTT 1560

TGCCCAATGA CTGTAGCATT CAAGGCTTGA GGCTGGAGGA GGTAGCTAGG GTATAGTGGC 1620 

TGGTGAGGCT GCACAGAGCA GATTCAAGAA AGAAGATCAG GAAGGGGCAT GACCCCTGAG 1680 

TTATGAAGGG GAGAAGGGAC AGATGAGCTT CCGGAGACTG CTCTCCTCAC CACACAGCAC 1740

TAGTCCATCC TCAGCACCTG AGCCTCCCTC ACTTGGACAC TCAGGGGACC ACACAGAGAA 1800

GTGGATGGAC ACTTCGCCAT CCAGGCAGAA CTAAGCCAGG CATAACCACA GCCAAGCAGA 1860 

TTAACCCCAG GCAGACCGAT AAAAAGACCT CCAGATAGGC AGACAGACAG ATGGACCACC 1920 

AACCTGGACA GACAGCCAAA GCTTCAGAGA TACAGTCCAC AGGTGGACAA AGGATCCCCC 1980 

AGCCAGAGAG AGAGAGACCA GCCAACAGCT TGATAGACCA GTGCAGCCAG AGAGACCACC 2040

AAACACAGCC CCCAAAAGAC AGACATCTCT GCTAGCTGGA CAGCCAGGTG GACCCCCTAA 2100 

GTTAGTCAGA TTACTAGACA GATATAAACA GATCCCCTGC TGAACAGATA TACAGAGTTC 2160 

TCAGACCCCA CTCCCTCAGG TGGGCTGGCT GGCTGACAGA CCTTCTGGCC AGACAGACTC 2220 

CTAACCAACC AGATGGACTG CCAGACAGGC AGACATCAGT CCACATGGAA TCCTGACATC 2280

CCAGCCAGCC GGCCAGACTC TCATCTTGAT GTCTTGATGG ATGGACCCCA GCTAGTCAGA 2340

CATGATCCTC CAGATTGACA GACAAGTCCC CCAAATGAGT ACACATCTCC AGCTATTCAG 2400

ACAGATGGAG CCCCAGCAAA TCAGGACCTA TCTAGGCAGA CCCCAGCCAG ACCCCCGCCA 2460

GACAGACTCC CAACCAGACT GACCCCTTGC TGTTCACACA GCCTGCCGAG TAGCTGGGAC 2520 

TACAGGTCTA ATTTTTTTTT TTTTTAAGAA ATGAGTTTTT GCCATGTTGC CCAGACTGGT 2580

CTTGAACTCC CAACCTCAAG CAATCCTCCT GCCTCAGCCT CCCAAAGTGC TGAGATTACA 2640 

GGTGTGAGCC ACCAGGCTCA GCCCCCTAAG ATTTGAAACA CTTTAAATGG CCCATGGTAG 2700

GGTTCCTGCT AGGATAAAAC ATTAAGTGGC TGTTAAAAGA AATAAAAGGA GGACACGTCT 2760 

CTGTGCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2810



(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 458 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Met Ser Arg Ser Leu Asp Ser Ala Arg Ser Phe Leu Glu Arg Leu Glu 
1 5 10 15

Ala Arg Gly Gly Arg Glu Gly Ala Val Leu Ala Gly Glu Phe Ser Asp
20 25 30

Ile Gln Ala Cys Ser Ala Ala Trp Lys Ala Asp Gly Val Cys Ser Thr
35 40 45 

Val Ala Gly Ser Arg Pro Glu Asn Val Arg Lys Asn Arg Tyr Lys Asp 
50 55 60 

Val Leu Pro Tyr Asp Gln Thr Arg Val Ile Leu Ser Leu Leu Gln Glu
65 70 75 80

Glu Gly His Ser Asp Tyr Ile Asn Gly Asn Phe Ile Arg Gly Val Asp
85 90 95

Gly Ser Leu Ala Tyr Ile Ala Thr Gln Gly Pro Leu Pro His Thr Leu
100 105 110

Leu Asp Phe Trp Arg Leu Val Trp Glu Phe Gly Val Lys Val Ile Leu
115 120 125

Met Ala Cys Arg Glu Ile Glu Asn Gly Arg Lys Arg Cys Glu Arg Tyr
130 135 140

Trp Ala Gln Glu Gln Glu Pro Leu Gln Thr Gly Leu Phe Cys Ile Thr
145 150 155 160

Leu Ile Lys Glu Lys Trp Leu Asn Glu Asp Ile Met Leu Arg Thr Leu
165 170 175

Lys Val Thr Phe Gln Lys Glu Ser Arg Ser Val Tyr Gln Leu Gln Tyr
180 185 190 

Met Ser Trp Pro Asp Arg Gly Val Pro Ser Ser Pro Asp His Met Leu 
195 200 205 

Ala Met Val Glu Glu Ala Arg Arg Leu Gln Gly Ser Gly Pro Glu Pro
210 215 220 

Leu Cys Val His Cys Ser Ala Gly Cys Gly Arg Thr Gly Val Leu Cys 
225 230 235 240 

Thr Val Asp Tyr Val Arg Gln Leu Leu Leu Thr Gln Met Ile Pro Pro
245 250 255

Asp Phe Ser Leu Phe Asp Val Val Leu Lys Met Arg Lys Gln Arg Pro
260 265 270

Ala Ala Val Gln Thr Glu Glu Gln Tyr Arg Phe Leu Tyr His Thr Val
275 280 285

Ala Gln Met Phe Cys Ser Thr Leu Gln Asn Ala Ser Pro His Tyr Gln
290 295 300

Asn Ile Lys Glu Asn Cys Ala Pro Leu Tyr Asp Asp Ala Leu Phe Leu
305 310 315 320

Arg Thr Pro Gln Ala Leu Leu Ala Ile Pro Arg Pro Pro Gly Gly Val
325 330 335

Leu Arg Ser Ile Ser Val Pro Gly Ser Pro Gly His Ala Met Ala Asp
340 345 350

Thr Tyr Ala Glu Glu Gln Lys Arg Gly Ala Pro Ala Gly Ala Gly Ser
355 360 365

Gly Thr Gln Thr Gly Thr Gly Thr Gly Ala Arg Ser Ala Glu Glu Ala
370 375 380

Pro Leu Tyr Ser Lys Val Thr Pro Arg Ala Gln Arg Pro Gly Ala His
385 390 395 400

Ala Glu Asp Ala Arg Gly Thr Leu Pro Gly Arg Val Pro Ala Asp Gln
405 410 415

Ser Pro Ala Gly Ser Gly Ala Tyr Glu Asp Val Ala Gly Gly Ala Gln
420 425 430

Thr Gly Gly Leu Gly Phe Asn Leu Arg Ile Gly Arg Pro Lys Gly Pro
435 440 445 

Arg Asp Pro Pro Ala Glu Trp Thr Arg Val 
450 455 



(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Met Glu Pro Ala Gly Pro Ala Pro Gly Arg Leu Gly Pro Leu Leu Cys
1 5 10 15

Leu Leu Leu Ala Ala Ser Cys Ala Trp Ser Gly Val Ala Gly Glu Glu 
20 25 30 

Glu Leu Gln Val Ile Gln Pro Asp Lys Ser Val Ser Val Ala Ala Gly
35 40 45

Glu Ser Ala Ile Leu His Cys Thr Val Thr Ser Leu Ile Pro Val Gly
50 55 60

Pro Ile Gln Trp Phe Arg Gly Ala Gly Pro Ala Arg Glu Leu Ile Tyr
65 70 75 80

Asn Gln Lys Glu Gly His Phe Pro Arg Val Thr Thr Val Ser Glu Ser
85 90 95

Thr Lys Arg Glu Asn Met Asp Phe Ser Ile Ser Ile Ser Asn Ile Thr
100 105 110

Pro Ala Asp Ala Gly Thr Tyr Tyr Cys Val Lys Phe Arg Lys Gly Ser
115 120 125 

Pro Asp Thr Glu Phe Lys Ser Gly Ala Gly Thr Glu Leu Ser Val Arg 
130 135 140 

Ala Lys Pro Ser Ala Pro Val Val Ser Gly Pro Ala Ala Arg Ala Thr 
145 150 155 160 

Pro Gln His Thr Val Ser Phe Thr Cys Glu Ser His Gly Phe Ser Pro
165 170 175 



Phe Gln Thr Asn Val Asp Pro Val Gly Glu Ser Val Ser Tyr Ser Ile
195 200 205

His Ser Thr Ala Lys Val Val Leu Thr Arg Glu Asp Val His Ser Gln
210 215 220

Val Ile Cys Glu Val Ala His Val Thr Leu Gln Gly Asp Pro Leu Arg
225 230 235 240

Gly Thr Ala Asn Leu Ser Glu Thr Ile Arg Val Pro Pro Thr Leu Glu
245 250 255

Val Thr Gln Gln Pro Val Arg Ala Glu Asn Gln Val Asn Val Thr Cys
260 265 270

Gln Val Arg Lys Phe Tyr Pro Gln Arg Leu Gln Leu Thr Trp Leu Glu
275 280 285 

Asn Gly Asn Val Ser Arg Thr Glu Thr Ala Ser Thr Val Thr Glu Asn 
290 295 300 

Lys Asp Gly Thr Tyr Asn Trp Met Ser Trp Leu Leu Val Asn Val Ser
305 310 315 320

Ala His Arg Asp Asp Val Lys Leu Thr Cys Gln Val Glu His Asp Gly
325 330 335

Gln Pro Ala Val Ser Lys Ser His Asp Leu Lys Val Ser Ala His Pro
340 345 350

Lys Glu Gln Gly Ser Asn Thr Ala Ala Glu Asn Thr Gly Ser Asn Glu
355 360 365

Arg Asn Ile Tyr Ile Val Val Gly Val Val Cys Thr Leu Leu Val Ala
370 375 380

Leu Leu Met Ala Ala Leu Tyr Leu Val Arg Ile Arg Gln Lys Lys Ala
385 390 395 400

Gln Gly Ser Thr Ser Ser Thr Arg Leu His Glu Pro Glu Lys Asn Ala
405 410 415

Arg Glu Ile Thr Gln Asp Thr Asn Asp Ile Thr Tyr Ala Asp Leu Asn
420 425 430

Leu Pro Lys Gly Lys Lys Pro Ala Pro Gln Ala Ala Glu Pro Asn Asn
435 440 445

His Thr Glu Tyr Ala Ser Ile Gln Thr Ser Pro Gln Pro Ala Ser Glu
450 455 460

Asp Thr Leu Thr Tyr Ala Asp Leu Asp Met Val His Leu Asn Arg Thr
465 470 475 480

Pro Lys Gln Pro Ala Pro Lys Pro Glu Pro Ser Phe Ser Glu Tyr Ala
485 490 495

Ser Val Gln Val Pro Arg Lys
500 



(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 398 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Met Pro Val Pro Ala Ser Trp Pro His Leu Pro Ser Pro Phe Leu Leu
1 5 10 15

Met Thr Leu Leu Leu Gly Arg Leu Thr Gly Val Ala Gly Glu Asp Glu 
20 25 30 

Leu Gln Val Ile Gln Pro Glu Lys Ser Val Ser Val Ala Ala Gly Glu
35 40 45

Ser Ala Thr Leu Arg Cys Ala Met Thr Ser Leu Ile Pro Val Gly Pro
50 55 60

Ile Met Trp Phe Arg Gly Ala Gly Ala Gly Arg Glu Leu Ile Tyr Asn
65 70 75 80

Gln Lys Glu Gly His Phe Pro Arg Val Thr Thr Val Ser Glu Leu Thr
85 90 95

Lys Arg Asn Asn Leu Asn Phe Ser Ile Ser Ile Ser Asn Ile Thr Pro
100 105 110 

Ala Asp Ala Gly Thr Tyr Tyr Cys Val Lys Phe Arg Lys Gly Ser Pro 
115 120 125 

Asp Asp Val Glu Phe Lys Ser Gly Ala Gly Thr Glu Leu Ser Val Arg
130 135 140 

Ala Lys Pro Ser Ala Pro Val Val Ser Gly Pro Ala Val Arg Ala Thr 
145 150 155 160 

Pro Glu His Thr Val Ser Phe Thr Cys Glu Ser His Gly Phe Ser Pro 
165 170 175

Arg Asp Ile Thr Leu Lys Trp Phe Lys Asn Gly Asn Glu Leu Ser Asp
180 185 190

Phe Gln Thr Asn Val Asp Pro Ala Gly Asp Ser Val Ser Tyr Ser Ile
195 200 205

His Ser Thr Ala Arg Val Val Leu Thr Arg Gly Asp Val His Ser Gln
210 215 220

Val Ile Cys Glu Met Ala His Ile Thr Leu Gln Gly Asp Pro Leu Arg
225 230 235 240

Gly Thr Ala Asn Leu Ser Glu Ala Ile Arg Val Pro Pro Thr Leu Glu
245 250 255

Val Thr Gln Gln Pro Met Arg Ala Glu Asn Gln Ala Asn Val Thr Cys
260 265 270

Gln Val Ser Asn Phe Tyr Pro Arg Gly Leu Gln Leu Thr Trp Leu Glu
275 280 285 

Asn Gly Asn Val Ser Arg Thr Glu Thr Ala Ser Thr Leu Thr Glu Asn 
290 295 300 

Lys Asp Gly Thr Tyr Asn Trp Met Ser Trp Leu Leu Val Asn Thr Cys 
305 310 315 320 

Ala His Arg Asp Asp Val Val Leu Thr Cys Gln Val Glu His Asp Gly
325 330 335

Gln Gln Ala Val Ser Lys Ser Tyr Ala Leu Glu Ile Ser Ala His Gln
340 345 350

Lys Glu His Gly Ser Asp Ile Thr His Glu Pro Ala Leu Ala Pro Thr
355 360 365 

Ala Pro Leu Leu Val Ala Leu Leu Leu Gly Pro Lys Leu Leu Leu Val 
370 375 380 

Val Gly Val Ser Ala Ile Tyr Ile Cys Trp Lys Gln Lys Ala
385 390 395 